Search for: All records

Creators/Authors contains: "Bickel, Peter J."

Two provably consistent divide-and-conquer clustering algorithms for large networks

https://doi.org/10.1073/pnas.2100482118

Mukherjee, Soumendu Sundar; Sarkar, Purnamrita; Bickel, Peter J. (October 2021, Proceedings of the National Academy of Sciences)

In this article, we advance divide-and-conquer strategies for solving the community detection problem in networks. We propose two algorithms that perform clustering on several small subgraphs and finally patch the results into a single clustering. The main advantage of these algorithms is that they significantly bring down the computational cost of traditional algorithms, including spectral clustering, semidefinite programs, modularity-based methods, likelihood-based methods, etc., without losing accuracy, and even improving accuracy at times. These algorithms are also, by nature, parallelizable. Since most traditional algorithms are accurate, and the corresponding optimization problems are much simpler in small problems, our divide-and-conquer methods provide an omnibus recipe for scaling traditional algorithms up to large networks. We prove the consistency of these algorithms under various subgraph selection procedures and perform extensive simulations and real-data analysis to understand the advantages of the divide-and-conquer approach in various settings.
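
The patching step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the subgraph results are combined, so the rule below (averaging pairwise co-membership over the subgraphs that contain both nodes) is an assumption chosen for illustration, and the names `average_comembership` and `min_count` are hypothetical. The sketch also assumes each subgraph has already been clustered by some base algorithm.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Tuple


def average_comembership(
    subgraph_labels: List[Dict[int, int]],
    min_count: int = 1,
) -> Dict[Tuple[int, int], float]:
    """Average pairwise co-membership across subgraph clusterings.

    Each element of `subgraph_labels` maps node id -> cluster label for one
    subgraph. Labels only need to be consistent within a subgraph, because
    co-membership does not depend on how clusters are numbered.
    Pairs seen together in fewer than `min_count` subgraphs are dropped
    (an illustrative assumption, not a rule taken from the abstract).
    """
    together = defaultdict(int)  # times a pair shared a cluster
    seen = defaultdict(int)      # times a pair appeared in the same subgraph

    for labels in subgraph_labels:
        for u, v in combinations(sorted(labels), 2):
            seen[(u, v)] += 1
            if labels[u] == labels[v]:
                together[(u, v)] += 1

    return {
        pair: together[pair] / count
        for pair, count in seen.items()
        if count >= min_count
    }


# Three hypothetical subgraph clusterings over nodes 0..3.
runs = [
    {0: 0, 1: 0, 2: 1},
    {1: 1, 2: 0, 3: 0},
    {0: 1, 1: 1, 3: 0},
]
scores = average_comembership(runs, min_count=2)
```

Because each subgraph is clustered independently, the per-subgraph work can run in parallel; only this final averaging needs all the results at once.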

Comment: Ridge Regression and Regularization of Large Matrices

https://doi.org/10.1080/00401706.2020.1796815

Le, Can M.; Levin, Keith; Bickel, Peter J.; Levina, Elizaveta (October 2020, Technometrics)
Full Text Available

Hierarchical Community Detection by Recursive Partitioning

https://doi.org/10.1080/01621459.2020.1833888

Li, Tianxi; Lei, Lihua; Bhattacharyya, Sharmodeep; Van den Berge, Koen; Sarkar, Purnamrita; Bickel, Peter J.; Levina, Elizaveta (October 2020, Journal of the American Statistical Association)
Full Text Available

Asymptotics for high dimensional regression M-estimates: fixed design results

https://doi.org/10.1007/s00440-017-0824-7

Lei, Lihua; Bickel, Peter J.; El Karoui, Noureddine (December 2018, Probability Theory and Related Fields)

Full Text Available

Metalearners for estimating heterogeneous treatment effects using machine learning

https://doi.org/10.1073/pnas.1804597116

Künzel, Sören R.; Sekhon, Jasjeet S.; Bickel, Peter J.; Yu, Bin (February 2019, Proceedings of the National Academy of Sciences)

There is growing interest in estimating and analyzing heterogeneous treatment effects in experimental and observational studies. We describe a number of metaalgorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the conditional average treatment effect (CATE) function. Metaalgorithms build on base algorithms—such as random forests (RFs), Bayesian additive regression trees (BARTs), or neural networks—to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a metaalgorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz-continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favorably, although none of the metalearners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our X-learner can be used to target treatment regimes and to shed light on underlying mechanisms. A software package is provided that implements our methods.
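
A minimal sketch of an X-learner for a single covariate follows. The abstract does not spell out the steps; the three stages used here (fit a response function per group, impute individual effects with the other group's model, then blend the two effect models with a propensity weight) are the commonly described form of the estimator. The one-variable least-squares base learner `fit_line`, the function `x_learner`, and the constant propensity default are illustrative assumptions, not the authors' software package.

```python
from typing import Callable, Optional, Sequence

Model = Callable[[float], float]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Toy base learner: ordinary least squares on one covariate."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def x_learner(
    xs: Sequence[float],
    ys: Sequence[float],
    treated: Sequence[bool],
    fit: Callable[[Sequence[float], Sequence[float]], Model] = fit_line,
    propensity: Optional[Model] = None,
) -> Model:
    """Return an estimate of the CATE function tau(x).

    Assumes both the treated and the control group are non-empty.
    """
    x0 = [x for x, w in zip(xs, treated) if not w]
    y0 = [y for y, w in zip(ys, treated) if not w]
    x1 = [x for x, w in zip(xs, treated) if w]
    y1 = [y for y, w in zip(ys, treated) if w]

    # Stage 1: response functions for control (mu0) and treatment (mu1).
    mu0 = fit(x0, y0)
    mu1 = fit(x1, y1)

    # Stage 2: impute unit-level effects using the other group's model,
    # then regress the imputed effects on the covariate.
    d1 = [y - mu0(x) for x, y in zip(x1, y1)]
    d0 = [mu1(x) - y for x, y in zip(x0, y0)]
    tau1 = fit(x1, d1)
    tau0 = fit(x0, d0)

    # Stage 3: weighted combination. Without a supplied propensity model,
    # fall back to the treated share (an assumption made for this sketch).
    if propensity is None:
        share = len(x1) / len(xs)
        g: Model = lambda x: share
    else:
        g = propensity

    return lambda x: g(x) * tau0(x) + (1.0 - g(x)) * tau1(x)
```

With a small treated group, the weight on `tau1` (which relies on the control-group model fitted to the larger sample) is close to one under the default weighting. That is the unbalanced setting the abstract highlights.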

Projection pursuit in high dimensions

https://doi.org/10.1073/pnas.1801177115

Bickel, Peter J.; Kur, Gil; Nadler, Boaz (September 2018, Proceedings of the National Academy of Sciences)

Hypothesis Testing for Automated Community Detection in Networks

https://doi.org/10.1111/rssb.12117

Bickel, Peter J.; Sarkar, Purnamrita (May 2015, Journal of the Royal Statistical Society Series B: Statistical Methodology)

Summary
Community detection in networks is a key exploratory tool with applications in a diverse set of areas, ranging from finding communities in social and biological networks to identifying link farms in the World Wide Web. The problem of finding communities or clusters in a network has received much attention from statistics, physics and computer science. However, most clustering algorithms assume knowledge of the number of clusters k. We propose to determine k automatically in a graph generated from a stochastic block model by using a hypothesis test of independent interest. Our main contribution is twofold; first, we theoretically establish the limiting distribution of the principal eigenvalue of the suitably centred and scaled adjacency matrix and use that distribution for our test of the hypothesis that a random graph is of Erdős–Rényi (noise) type. Secondly, we use this test to design a recursive bipartitioning algorithm, which naturally uncovers nested community structure. Using simulations and quantifiable classification tasks on real world networks with ground truth, we show that our algorithm outperforms state of the art methods.
